Automated Taxonomy Generation for Summarizing Multi-Type Relational Datasets

Authors

Tao Li

Sarabjot S. Anand

Abstract

Taxonomy construction gives people an efficient mechanism for navigating and browsing by organizing large amounts of information into a small number of hierarchical clusters. Compared with manually edited taxonomies, Automated Taxonomy Generation (ATG) has numerous advantages and has therefore been applied to categorize document collections. However, the utility of this technique for organizing and representing relational datasets has not been investigated because of its prohibitive computational complexity. In this paper we propose a new ATG method based on the relational clustering framework DIVA. By incorporating the idea of Representative Objects, the computational complexity can be greatly reduced. Moreover, we analyze the divergence of the data attributes and label the taxonomic nodes accordingly. The quality of the derived taxonomy is quantitatively evaluated by a synthesized criterion that considers both intra-node homogeneity and inter-node heterogeneity. Theoretical analysis and experimental results show that our approach is comparably effective and more efficient than other ATG algorithms.

Similar resources

Labeling Nodes of Automatically Generated Taxonomy for Multi-type Relational Datasets

Automatic Taxonomy Generation organizes a large dataset into a hierarchical structure so as to facilitate people’s navigation and browsing actions. To better summarize the content of each node as well as to reflect the distinctiveness between sibling ones, meaningful labels need to be assigned to all the nodes within a derived taxonomy. Current research only focuses on labeling taxonomies that ...

Multi-type Relational Clustering Approaches: Current State-of-the-Art and New Directions

The proliferation of multi-type relational datasets in a number of important real-world applications and the limitations resulting from the transformation of such datasets to fit propositional data mining approaches have led to the emergence of the discipline of multi-type relational data mining. Clustering is an important unsupervised learning task aimed at discovering structure inherent in da...

Exploiting Domain Knowledge by Automated Taxonomy Generation in Recommender Systems

The effectiveness of incorporating domain knowledge into recommender systems to address their sparseness problem and improve their prediction accuracy has been discussed in many research works. However, this technique is usually restrained in practice because of its high computational expense. Although cluster analysis can alleviate the computational complexity of the recommendation procedure, ...

A Taxonomy of Meta-learning Techniques and Proposed Framework for Automated Landmarker Generation and Selection

Many different perspectives have been adopted regarding the form of learning labelled as meta-learning, with little to no consensus as to a proper definition. As such, a general definition and taxonomy of meta-learning techniques is defined, segmenting meta-learning into two categories: mono-problem and multi-problem. A further taxonomy of multi-problem meta-learning methods is then described, empha...

Transforming Graph Representations for Statistical Relational Learning

Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of statistical relational learning (SRL) algorithms to these domains. In this article, we examine a range of representation issues for graph-based relational data. Since th...

Publication date: 2008
